Detecting Events in a Million New York Times Articles

Authors

  • Tristan Snowsill
  • Ilias N. Flaounas
  • Tijl De Bie
  • Nello Cristianini
Abstract

We present a demonstration of a newly developed text stream event detection method on over a million articles from the New York Times corpus. The method is designed to operate in a predominantly on-line fashion, reporting new events within a specified timeframe. It detects events by identifying significant changes in the statistical properties of the text, where those properties are efficiently stored and updated in a suffix tree. This demonstration shows that the method is effective at discovering both short- and long-term events (which are often denoted topics), and that it automatically copes with topic drift, on a corpus of 1 035 263 articles.
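
As a rough illustration of the idea of flagging significant changes in text statistics, the sketch below compares n-gram counts in a recent window of articles against counts from the earlier history. It is not the authors' implementation: a plain `Counter` stands in for the suffix tree, the significance test is a simple binomial z-score, and all names (`ngrams`, `count_ngrams`, `detect_bursts`, `threshold`, `min_count`) are hypothetical choices made for this example.

```python
from collections import Counter
from math import sqrt


def ngrams(text, n_max=3):
    """Yield all word n-grams of length 1..n_max from a text."""
    words = text.lower().split()
    for n in range(1, n_max + 1):
        for i in range(len(words) - n + 1):
            yield tuple(words[i:i + n])


def count_ngrams(docs, n_max=3):
    """Count n-gram occurrences over a batch of documents.

    Assumption: a Counter replaces the suffix tree described in the abstract.
    """
    counts = Counter()
    for doc in docs:
        counts.update(ngrams(doc, n_max))
    return counts


def detect_bursts(history, window, threshold=3.0, min_count=3):
    """Return (n-gram, z-score) pairs whose window rate far exceeds the history rate.

    Assumption: a binomial z-score with a smoothed baseline rate is used as
    the test of significance; the paper's actual statistic may differ.
    """
    total_h = sum(history.values())
    total_w = sum(window.values())
    if total_w == 0:
        return []
    bursts = []
    for gram, k in window.items():
        if k < min_count:
            continue  # ignore n-grams too rare to judge
        p = (history.get(gram, 0) + 1) / (total_h + 2)  # smoothed, always in (0, 1)
        expected = total_w * p
        std = sqrt(total_w * p * (1 - p))
        z = (k - expected) / std
        if z > threshold:
            bursts.append((gram, z))
    return sorted(bursts, key=lambda item: -item[1])
```

In an on-line setting, one would call `detect_bursts(history, window)` for each new batch of articles (for example, one day) and then fold the batch into the baseline with `history.update(window)`, so that the baseline follows gradual topic drift.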

Similar resources

Lean six sigma process improvement in specimen receiving to improve stat chemistry turnaround times

  Objective: As a consequence of stat turnaround times (TATs) chronically exceeding 60 minutes, our laboratory was facing pressure to divert limited resources toward the implementation of an emergency department satellite laboratory. Peer-reviewed literature in clinical laboratory quality assurance and improvement indicates that between 60-70% of errors occur at the pre-analytical level.  Thus...

Global News and Awareness: An Examination of El País and The New York Times and their Relation to Public Knowledge and Opinion Levels

This research examined the content of articles about Brazil in The New York Times and El País and the correlation between news coverage and students’ knowledge and opinion levels on current events in Brazil. Content analysis revealed that American news coverage of Brazil in The New York Times had more depth and breadth than Spanish coverage in El País. Questionnaires distributed to University o...

Metadiscourse Markers: A Contrastive Study of Translated and Non-Translated Persuasive Texts

Metadiscourse features are those facets of a text, which make the organization of the text explicit, provide information about the writer's attitude toward the text content, and engage the reader in the interaction. This study interpreted metadiscourse markers in translated and non-translated persuasive texts. To this end, the researcher chose the translated versions of one of the leading newsp...

Recency is good: expanding with fresh news improves event detection in Twitter

Twitter is a popular microblogging site that is a good source of real-time information. Detecting events in Twitter is an ongoing research effort and a fundamental task is clustering tweets according to which (news) event they describe. Document expansion can improve this clustering, especially for Twitter, given that tweets are short. While document expansion using external corpora has been ar...

The Role of Culture in Sports Sponsorship: an Update

Nowadays sponsorship is an important part of sports events. Sports sponsorship offers more benefits, more variety and also it’s a more powerful form of marketing. In general, sponsorship holds a unique position in the marketing mix because it is effective in building brand awareness, provides different marketing platforms and valuable networking and hospitality opportunities. Sponsorship market...

Publication date: 2010